1 Introduction

There has been an explosion in the number of models proposed for understand- 
ing and interpreting the dynamics of financial markets. Broadly speaking, all 
such models can be classified into two categories: (a) models which charac- 
terize the macroscopic dynamics of financial prices using time-series methods, 
and (b) models which mimic the microscopic behavior of the trader population 
in order to capture the general macroscopic behavior of prices. Recently, many 
econophysicists have trended towards the latter by using multi-agent models 
of trader populations. One particularly popular example is the so-called Mi- 
nority Game [1], a conceptually simple multi-player game which can show 
non-trivial behavior reminiscent of real markets. Subsequent work has shown 
that - at least in principle - it is possible to train such multi-agent games 
on real market data in order to make useful predictions [2, 3, 4, 5]. However, 
anyone attempting to model a financial market using such multi-agent trader 
games, with the objective of then using the model to make predictions of real 
financial time-series, faces two problems: (a) How to choose an appropriate 
multi-agent model? (b) How to infer the level of heterogeneity within the 
associated multi-agent population? 

This paper addresses the question of how to infer the multi-trader het- 
erogeneity in a market (i.e. question (b)) assuming that the Minority Game, 
or one of its many generalizations [2, 3], forms the underlying multi-trader 
model. We consider the specific case where each agent possesses a pair of 
strategies and chooses the best-performing one at each timestep. Our focus is
on the uncertainty in our parameter estimates. Quantifying this uncertainty
using real financial data represents a crucial step in developing associated
risk measures, and in identifying pockets of predictability in the
price-series.



2 Nachi Gupta, Raphael Hauser, and Neil F. Johnson 

As such, this paper represents an extension of our preliminary study in [6].
In particular, the present analysis represents an important advance in that it 
generalizes the use of probabilities for describing the agents' heterogeneity. 
Rather than using a probability, we now use a finite measure, which is not 
necessarily normalized to unit total weight. This generalization yields a num- 
ber of benefits such as a stronger preservation of positive definiteness in the 
covariance matrix for the estimates. In addition, the use of such a measure 
removes the necessity to scale the time-series, thereby reducing possible fur- 
ther errors. We also look into the problem of estimating the finite measure 
over a space of agents which is so large that the estimation technique becomes 
computationally infeasible. We propose a mechanism for making this problem 
more tractable, by employing many runs with small subsets chosen from the 
full space of agents. The final tool we present here is a method for removing 
bias from the estimates. As a result of choosing subsets of the full agent space, 
an individual run can exhibit a bias in its predictions. In order to estimate 
and remove this bias, we propose a technique that has been widely used with 
Kalman Filtering in other application domains. 

2 The Multi-Agent Market Model

Many multi-agent models - such as the Minority Game [1] and its generaliza- 
tions [2, 3] - are based on binary decisions. Agents compete with each other for 
a limited resource (e.g. a good price) by taking a binary action at each time- 
step, in response to global price information which is publicly available. At 
the end of each time-step, one of the actions is denoted as the winning action. 
This winning action then becomes part of the information set for the future. 
As an illustration of the tracking scheme, we will use the Minority Game - 
however we encourage the reader to choose their own preferred multi-agent 
game. The game need not be a binary-decision game, but for the purposes of 
demonstration we will assume that it is. 

2.1 Parameterizing the Game 

We provide one possible way of parameterizing the game in order to fit the 
proposed methodology. We select a time horizon window of length T over 
which we score strategies for each agent. This is a sliding window given by
$(w_{k-T}, \ldots, w_{k-1})$ at time step $k$. Here $w_k = -\mathrm{sgn}(z_k)$ represents what would
have been the winning decision at time $k$, and $z_k$ is the difference in the
corresponding price-series, or exchange-rate, $r_k$:

$$z_k = r_k - r_{k-1} \tag{1}$$

Each agent has a set of strategies which it scores on this sliding time-horizon 
window at each time-step. The agent chooses its highest scoring strategy as its 
winning strategy, and then plays it. Assume we have N such agents. At each
time-step they each play their winning strategy, resulting in a global outcome 
for the game at that time-step. Their aggregate actions result in an outcome 
which we expect to be indicative of the next price-movement in the real price- 
series. If one knew the population of strategies in use, then one could predict 
the future price-series with certainty - apart from a few occasions where ties 
in strategies might be broken randomly. 
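The scoring step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a strategy is represented as a lookup table from the last m winning decisions to an action of +1 or -1, the score is a simple hit count over the window of length T, and ties are broken in favour of the first strategy rather than randomly. The names `history_index`, `score`, and `agent_decision` are hypothetical and do not come from the original work.

```python
def history_index(w, m):
    """Encode the last m winning decisions (+1/-1) as an integer in [0, 2**m)."""
    idx = 0
    for d in w[-m:]:
        idx = 2 * idx + (1 if d > 0 else 0)
    return idx


def score(strategy, w, m, T):
    """Count how often `strategy` would have picked the winning decision
    over the last T time-steps of the history w (needs len(w) >= T + m)."""
    hits = 0
    for t in range(len(w) - T, len(w)):
        predicted = strategy[history_index(w[:t], m)]
        hits += int(predicted == w[t])
    return hits


def agent_decision(strategies, w, m, T):
    """Play the highest-scoring strategy (assumption: ties go to the first one)."""
    scores = [score(s, w, m, T) for s in strategies]
    best = strategies[scores.index(max(scores))]
    return best[history_index(w, m)]


# Two memory-1 strategies: entry 0 is used when the last outcome was -1,
# entry 1 when it was +1.
follower = [-1, +1]     # repeat the last winning decision
contrarian = [+1, -1]   # do the opposite of the last winning decision
w = [+1, -1, +1, -1, +1, -1, +1]   # alternating winning decisions
decision = agent_decision([follower, contrarian], w, m=1, T=4)
```

On this alternating history the contrarian strategy scores every step in the window, so the agent plays it. Collecting one such decision per agent type gives a row vector of decisions for the time-step.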

The next step is to estimate the heterogeneity of the agent population 
itself. We choose to use a recursive optimization scheme similar to a Kalman 
Filter - however we will also force inequality constraints on the estimates so 
that we cannot have a negative number of agents of a given type playing the 
game. Suppose $x_k$ is the vector at time-step $k$ representing the heterogeneity
among the $N$ types of agents in terms of their strategies. We can write this
as

$$x_k = \begin{bmatrix} x_{1,k} \\ \vdots \\ x_{N,k} \end{bmatrix} \tag{2}$$

where we force each element of the vector to be non-negative:

$$x_{i,k} \ge 0, \quad \forall i \tag{3}$$

We provided a very similar scheme recently in [6], in which we had further 
constrained our estimate to a probability space and as a result also had to 
re-scale the time-series to better span the interval $[-1, 1]$. We now relax the
constraint condition (and the re-scaling) since the benefits of lying within the
probability space are outweighed by the benefits of allowing the estimate to
move out of this space. One significant benefit of staying within the
probability space is the ability to bound covariance matrices on errors (since upper
bounds on the probability of certain events are known). On the other hand,
staying constrained to a probability space removes one degree of freedom from
our system (i.e. $x_{N,k} = 1 - x_{1,k} - \cdots - x_{N-1,k}$). This can cause the covariance
of our estimate to become ill-formed by a possible numerical loss of positive
definiteness (which we could prevent by artificially inflating the diagonal
elements of the covariance matrix).



3 Recursive Optimization Scheme 

In the following subsections we introduce the Kalman Filter, after which we 
will discuss some desirable extensions for our work. 



3.1 Kalman Filter 



A Kalman Filter is a recursive least-squares implementation, which makes 
only one pass through the data such that it can wait for each measurement
to come in real time, and then make an estimate for that time given all the
information from the past. The Kalman Filter holds a minimal amount of 
information in its memory at each time, yielding a relatively cheap compu- 
tational cost for solving the optimization problem. In addition, the Kalman 
Filter can make a forecast n steps ahead and provides a covariance structure 
concerning this forecast. The Kalman Filter is a predictor-corrector system, 
that is to say, it makes a prediction and upon observation of real data, it 
perturbs the prediction slightly as a correction, and so forth. 

The Kalman Filter attempts to find the best estimate at every iteration, 
for a system governed by the following model: 

$$x_k = F_{k,k-1} x_{k-1} + u_{k,k-1}, \qquad u_{k,k-1} \sim N(0, Q_{k,k-1}) \tag{4}$$
$$z_k = H_k x_k + v_k, \qquad v_k \sim N(0, R_k) \tag{5}$$

Here $x_k$ represents the true state of the underlying system, which in our case
is the finite measure over the agents. $F_{k,k-1}$ is a matrix used to make the
transition from state $x_{k-1}$ to $x_k$. In our applications, we choose $F_{k,k-1}$ to be
the identity matrix for all $k$ since we assume that locally the finite measure
over the strategies doesn't change drastically. It would be a tough modeling
problem to choose another matrix (i.e., not the identity matrix) - however, if
desired we could incorporate a more complex transition matrix into the model,
even one that is dependent on previous outcomes. The variable $z_k$ represents
the measurement (also called observation). $H_k$ is a matrix that relates the
state space and measurement space by transforming a vector in the state
space to the appropriate vector in the measurement space. For our artificial
market model, $H_k$ will be a row vector containing the decisions based on each
of the agents' winning strategies. So $x_k$, the measure over the agents, acts as
a weighting on the decisions for each agent, and the inner product $H_k x_k$ can
be thought of as a weighted average of the agents' decisions - this represents
the aggregate decision made by the system of agents. The variables $u_{k,k-1}$
and $v_k$ are both noise terms which are normally distributed with mean 0 and
covariances $Q_{k,k-1}$ and $R_k$, respectively.

The Kalman Filter will at every iteration make a prediction for $x_k$, which
we denote by $\hat{x}_{k|k-1}$. We use the notation $k|k-1$ since we will only use
measurements provided until time-step $k-1$ in order to make the prediction
at time $k$. We can define the state prediction error $\tilde{x}_{k|k-1}$ as the difference
between the true state and the state prediction.

$$\tilde{x}_{k|k-1} = x_k - \hat{x}_{k|k-1} \tag{6}$$

In addition, the Kalman Filter will provide a state estimate for $x_k$, given all
the measurements provided up to and including time step $k$. We denote these
estimates by $\hat{x}_{k|k}$. We can similarly define the state estimate error by

$$\tilde{x}_{k|k} = x_k - \hat{x}_{k|k} \tag{7}$$
Since we assume $u_{k,k-1}$ is normally distributed with mean 0, we make the
state prediction simply by using $F_{k,k-1}$ to make the transition. This is given
by

$$\hat{x}_{k|k-1} = F_{k,k-1} \hat{x}_{k-1|k-1} \tag{8}$$

We can also calculate the associated covariance for the state prediction, which 
we call the covariance prediction. This is actually just the expectation of the 
outer product of the state prediction error with itself. This is given by 

$$P_{k|k-1} = F_{k,k-1} P_{k-1|k-1} F_{k,k-1}' + Q_{k,k-1} \tag{9}$$

Notice that we use the prime notation on a matrix throughout this paper to 
denote the transpose. Now we can make a prediction on what we expect to 
see for our measurement, which we call the measurement prediction, by 

$$\hat{z}_{k|k-1} = H_k \hat{x}_{k|k-1} \tag{10}$$

The difference between our true measurement and our measurement prediction 
is called the measurement residual, which we calculate by 

$$\nu_k = z_k - \hat{z}_{k|k-1} \tag{11}$$

We can also calculate the associated covariance for the measurement residual, 
which we call the measurement residual covariance, by 

$$S_k = H_k P_{k|k-1} H_k' + R_k \tag{12}$$

We now calculate the Kalman Gain, which lies at the heart of the Kalman 
Filter. This essentially tells us how much we prefer our new observed mea- 
surement over our state prediction. We calculate this by 

$$K_k = P_{k|k-1} H_k' S_k^{-1} \tag{13}$$

Using the Kalman Gain and measurement residual, we update the state esti- 
mate. If we look carefully at the following equation, we are essentially taking a 
weighted sum of our state prediction with the Kalman Gain multiplied by the 
measurement residual. So the Kalman Gain is telling us how much to 'weight 
in' information contained in the new measurement. We calculate the updated 
state estimate by 

$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k \nu_k \tag{14}$$

Finally, we calculate the updated covariance estimate. This is just the
expectation of the outer product of the state estimate error with itself. Here
we will give the most numerically stable form of this equation, as this form
prevents loss of symmetry and best preserves positive definiteness

$$P_{k|k} = (I - K_k H_k) P_{k|k-1} (I - K_k H_k)' + K_k R_k K_k' \tag{15}$$
The covariance matrices throughout the Kalman Filter give us a way to measure
the uncertainty of our state prediction, state estimate, and the measurement
residual. Also notice that the Kalman Filter is recursive, and we require
an initial estimate $\hat{x}_{0|0}$ and associated covariance matrix $P_{0|0}$. Here we simply
provide the equations of the Kalman Filter without derivation. For a detailed
description of the Kalman Filter, see Ref. [7].
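One predictor-corrector iteration, Eqs. (8) to (15), can be sketched with NumPy as follows. This is a minimal illustration rather than the implementation used for the results in this paper; the function name `kalman_step` and its argument names are assumptions introduced here.

```python
import numpy as np


def kalman_step(x_est, P_est, z, F, H, Q, R):
    """One Kalman Filter iteration; comments give the equation numbers."""
    x_pred = F @ x_est                         # state prediction, Eq. (8)
    P_pred = F @ P_est @ F.T + Q               # covariance prediction, Eq. (9)
    z_pred = H @ x_pred                        # measurement prediction, Eq. (10)
    nu = z - z_pred                            # measurement residual, Eq. (11)
    S = H @ P_pred @ H.T + R                   # residual covariance, Eq. (12)
    K = P_pred @ H.T @ np.linalg.inv(S)        # Kalman Gain, Eq. (13)
    x_new = x_pred + K @ nu                    # updated state estimate, Eq. (14)
    I_KH = np.eye(len(x_est)) - K @ H
    P_new = I_KH @ P_pred @ I_KH.T + K @ R @ K.T   # symmetric update, Eq. (15)
    return x_new, P_new, z_pred, S


# Scalar example: prior estimate 0 with variance 1, measurement 2 with variance 1.
x_new, P_new, z_pred, S = kalman_step(
    x_est=np.array([0.0]), P_est=np.eye(1), z=np.array([2.0]),
    F=np.eye(1), H=np.eye(1), Q=np.zeros((1, 1)), R=np.eye(1),
)
```

With equal prior and measurement variances the gain is 1/2, so the updated estimate lies halfway between the prediction and the measurement, and the variance is halved.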



3.2 Nonlinear Equality Constraints 

As we are estimating a vector in which each element has a non-negative value,
we would like to force the Kalman Filter to have some inequality constraints.
We now introduce a generalization for nonlinear equality constraints followed
by an extension to inequality constraints. In particular, let's add to our model
(Eqs. (4) and (5)) the following smooth nonlinear equality constraints

$$e_k(x_k) = 0 \tag{16}$$



The constraints provided in Eq. (3) are actually linear. We present the
nonlinear case for further completeness here. We now rephrase the problem we
would like to solve, using the superscript $c$ to denote constrained. We are
given the last prediction and its covariance, the current measurement and its
covariance, and a set of equality constraints and would like to make the
current prediction and find its covariance matrix. Let's write the problem we are
solving as

$$z_k^c = h_k^c(x_k) + v_k^c, \qquad v_k^c \sim N(0, R_k^c) \tag{17}$$

Here $z_k^c$, $h_k^c$, and $v_k^c$ are all vectors, each having three distinct parts. The
first part will represent the prediction for the current time step, the second
part is the measurement, and the third part is the equality constraint. $z_k^c$
effectively still represents the measurement, with the prediction treated as a
"pseudo-measurement" with its associated covariance.



$$z_k^c = \begin{bmatrix} F_{k,k-1} \hat{x}_{k-1|k-1} \\ z_k \\ 0 \end{bmatrix} \tag{18}$$



The function $h_k^c$ takes our state into the measurement space as before

$$h_k^c(x_k) = \begin{bmatrix} x_k \\ H_k x_k \\ e_k(x_k) \end{bmatrix} \tag{19}$$



Notice that by combining Eqs. (6) and (7) with Eqs. (4) and (8), we can rewrite the
state prediction error as

$$\tilde{x}_{k|k-1} = F_{k,k-1} \tilde{x}_{k-1|k-1} + u_{k,k-1} \tag{20}$$

We can define $v_k^c$ again as the noise term using Eq. (20).

$$v_k^c = \begin{bmatrix} -F_{k,k-1} \tilde{x}_{k-1|k-1} - u_{k,k-1} \\ v_k \\ 0 \end{bmatrix} \tag{21}$$



$v_k^c$ will be normally distributed with mean 0 and covariance $R_k^c$. The diagonal
elements of $R_k^c$ represent the variance of each element of $v_k^c$. We define the
covariance of the state estimate error at time-step $k$ as $P_{k|k}$. Notice also that
$R_k^c$ contains no off-diagonal blocks.

$$R_k^c = \begin{bmatrix} F_{k,k-1} P_{k-1|k-1} F_{k,k-1}' + Q_{k,k-1} & 0 & 0 \\ 0 & R_k & 0 \\ 0 & 0 & 0 \end{bmatrix} \tag{22}$$



This method of expressing our problem can be thought of as a fusion of (a) 
the state prediction, and (b) the new measurement at each iteration, under 
the given equality constraints. As we did for the Kalman Filter, we will state 
the equations here. The interested reader is referred to Refs. [8, 9]. 



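$$\hat{x}_{k|k,j} = \begin{bmatrix} 0 & I \end{bmatrix} \begin{bmatrix} R_k^c & H_{k,j}^c \\ H_{k,j}^{c\prime} & 0 \end{bmatrix}^{+} \begin{bmatrix} z_k^c - h_k^c(\hat{x}_{k|k,j-1}) + H_{k,j}^c \hat{x}_{k|k,j-1} \\ 0 \end{bmatrix} \tag{23}$$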



Throughout this paper, we use the $+$ notation on a matrix to denote the
pseudo-inverse. In this method we are iterating over a dummy variable $j$
within each time-step, until we fall within a predetermined convergence bound
$|\hat{x}_{k|k,j} - \hat{x}_{k|k,j-1}| < c_k$ or hit a chosen number of maximum iterations. We
initialize our first iteration as $\hat{x}_{k|k,0} = \hat{x}_{k-1|k-1}$ and use the final
iteration as $\hat{x}_{k|k} = \hat{x}_{k|k,J}$, where $J$ represents the final iteration. Notice that
we allowed the equality constraints to be nonlinear. As a result, we define

$$H_{k,j}^c = \left. \frac{\partial h_k^c}{\partial x_k} \right|_{\hat{x}_{k|k,j-1}}$$

which gives us a local approximation to the direction of $h_k^c$.


A stronger form of these equations can be found in Refs. [8, 9], where $R_k^c$
will reflect the tightening of the covariance for the state prediction based on
the new estimate at each iteration of $j$. We do not use this form, and hence do not
tighten the covariance matrix within these iterations, since in the next section we will
require the flexibility of changing the number of equality constraints between
iterations of $j$. By not tightening the covariance matrix in this way, we are
left with a larger covariance matrix for the estimate (which shouldn't harm
us significantly). This covariance matrix is calculated as



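$$P_{k|k,j} = -\begin{bmatrix} 0 & I \end{bmatrix} \begin{bmatrix} R_k^c & H_{k,j}^c \\ H_{k,j}^{c\prime} & 0 \end{bmatrix}^{+} \begin{bmatrix} 0 \\ I \end{bmatrix} \tag{24}$$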



Notice that for faster computation times, we need only calculate $P_{k|k,j}$ for the
final iteration of $j$. Further, if our equality constraints are in fact independent
of $j$, we only need to calculate $H_{k,j}^c$ once for each $k$. This also implies that
the pseudo-inverse in Eq. (23) can be calculated only once for each $k$.

This method, while very different from the Kalman Filter presented earlier,
provides us with an estimate $\hat{x}_{k|k}$ and a covariance matrix for the estimate
$P_{k|k}$ at each time-step, in a similar way to the Kalman Filter. However, this
method allows us to incorporate equality constraints.
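To make the pseudo-inverse formulation of Eqs. (23) and (24) concrete, the sketch below treats the special case of linear equality constraints, written here as $E x_k = e_0$ (so that $e_k(x_k) = E x_k - e_0$). In this linear case the stacked Jacobian does not depend on $j$ and a single iteration suffices. The names `constrained_estimate`, `block_diag`, `E`, and `e0` are assumptions introduced for illustration only.

```python
import numpy as np


def block_diag(*blocks):
    """Assemble square blocks into one block-diagonal matrix."""
    n = sum(b.shape[0] for b in blocks)
    out = np.zeros((n, n))
    i = 0
    for b in blocks:
        k = b.shape[0]
        out[i:i + k, i:i + k] = b
        i += k
    return out


def constrained_estimate(x_pred, P_pred, z, H, R, E, e0):
    """Fuse prediction, measurement and linear constraints E x = e0.

    x_pred, P_pred are the state and covariance predictions of Eqs. (8), (9).
    """
    n, p = len(x_pred), E.shape[0]
    Hc = np.vstack([np.eye(n), H, E])                 # Jacobian of Eq. (19)
    Rc = block_diag(P_pred, R, np.zeros((p, p)))      # Eq. (22)
    # z^c - h^c(x) + H^c x reduces to [x_pred, z, e0] when h^c is linear.
    y = np.concatenate([x_pred, z, e0])
    rows = Hc.shape[0]
    M = np.block([[Rc, Hc], [Hc.T, np.zeros((n, n))]])
    M_pinv = np.linalg.pinv(M)
    select = np.hstack([np.zeros((n, rows)), np.eye(n)])   # the [0 I] selector
    x_est = select @ M_pinv @ np.concatenate([y, np.zeros(n)])   # Eq. (23)
    P_est = -select @ M_pinv @ select.T                          # Eq. (24)
    return x_est, P_est


x_pred, P_pred = np.array([0.3, 0.5]), np.eye(2)
H, R, z = np.array([[1.0, -1.0]]), np.array([[1.0]]), np.array([0.1])
E, e0 = np.array([[1.0, 1.0]]), np.array([1.0])
x_c, P_c = constrained_estimate(x_pred, P_pred, z, H, R, E, e0)
```

The returned estimate satisfies the constraint, and the covariance carries no uncertainty along the constrained direction. Passing an empty constraint (`E` with zero rows) recovers the ordinary Kalman Filter update.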



3.3 Nonlinear Inequality Constraints 

We will now extend the equality constrained problem to an inequality con- 
strained problem. To our system given by equations (4), (5), and (16), we will 
add the smooth inequality constraints given by 

$$l_k(x_k) \ge 0. \tag{25}$$

Our method will be to keep a subset of the inequality constraints active at 
any time. An active constraint is simply a constraint that we treat as an 
equality constraint. We will ignore any inactive constraint when solving our 
optimization problem. After solving the problem, we then check if our solution 
lies in the space given by the inequality constraints. If it doesn't, we start from 
the solution in our previous iteration and move in the direction of the new 
solution until we hit a set of constraints. For the next iteration, this set of 
constraints will be the new active constraints. 

We formulate the problem in the same way as before, keeping Eqs. (17),
(18), (21), and (22) the same to set up the problem. However, we replace Eq.
(19) by

$$h_k^c(x_k) = \begin{bmatrix} x_k \\ H_k x_k \\ e_k(x_k) \\ l_{k,j}^a(x_k) \end{bmatrix} \tag{26}$$



$l_{k,j}^a$ represents the set of active inequality constraints. Although we keep Eqs.
(18), (21), and (22) the same, these will need to be padded by additional zeros
appropriately to match the size of $l_{k,j}^a$. Now we solve the equality constrained
problem consisting of the equality constraints and the active inequality
constraints (which we treat as equality constraints) using Eqs. (23) and (24).
Let's now call the solution from Eq. (23) $\hat{x}^*_{k|k,j}$, since we have not yet checked
if this solution lies in the inequality constrained space. In order to check this,
we find the vector that we moved along to reach $\hat{x}^*_{k|k,j}$. This is simply

$$d = \hat{x}^*_{k|k,j} - \hat{x}_{k|k,j-1} \tag{27}$$

We now iterate through each of our inequality constraints, to check if they
are satisfied. If they are all satisfied, we choose $t_{\max} = 1$. If they are not, we
choose the largest value of $t_{\max}$ such that $\hat{x}_{k|k,j-1} + t_{\max} d$ lies in the inequality
constrained space. We choose our estimate to be

$$\hat{x}_{k|k,j} = \hat{x}_{k|k,j-1} + t_{\max} d \tag{28}$$
We also would like to remember the inequality constraints which are being
touched in this new solution. These constraints will now become active for the
next iteration and lie in $l_{k,j+1}^a$. Note that $l_{k,0}^a = l_{k-1,J}^a$, where $J$ represents the
final iteration of a given time-step. We do not perturb the error covariance
matrix from Eq. (24) in any way. Under the assumption that our model is a
well-matched model for the data, enforcing inequality constraints (as dictated
by the model) should only make our estimate better. Having a slightly larger
covariance matrix is better than having an overly optimistic one based on a
bad choice for the perturbation [10]. In the future, we will investigate how to
perturb this covariance matrix correctly.

In our application, our constraints are only to keep each element of our 
measure positive. Hence we have no equality constraints - only inequality con- 
straints. However, we needed to provide the framework to work with equality 
constraints before we could make the extension to inequality constraints. 
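For the non-negativity constraints used in our application, the step of Eqs. (27) and (28) reduces to a simple ratio test. The following sketch assumes that the previous iterate is already feasible (all elements non-negative); the function name `step_to_boundary` and the tolerance `tol` are illustrative choices, not part of the original method description.

```python
import numpy as np


def step_to_boundary(x_prev, x_star, tol=1e-12):
    """Move from a feasible x_prev towards x_star, stopping at x >= 0.

    Returns the new estimate, the step length t_max, and the indices of
    the constraints touched (the active set for the next iteration).
    """
    d = x_star - x_prev                          # direction, Eq. (27)
    t_max = 1.0
    for i in range(len(d)):
        if x_star[i] < 0.0:                      # constraint i would be violated
            # largest t with x_prev[i] + t * d[i] >= 0
            t_max = min(t_max, x_prev[i] / (x_prev[i] - x_star[i]))
    x_new = np.maximum(x_prev + t_max * d, 0.0)  # Eq. (28), clipping round-off
    active = [i for i in range(len(x_new)) if x_new[i] <= tol]
    return x_new, t_max, active


# The unconstrained solution would make the second element negative.
x_new, t_max, active = step_to_boundary(np.array([1.0, 1.0]),
                                        np.array([2.0, -1.0]))
```

Here the full step would leave the feasible region, so only half of it is taken, and the second constraint becomes active for the next iteration.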

Figure 1 provides a schematic diagram showing how this optimization 
scheme fits into the multi-agent game for making predictions. 

3.4 Noise Estimation 

In many applications of Kalman Filtering, the process noise $Q_{k,k-1}$ and
measurement noise $R_k$ are known. However, in our application we are not provided
with this information a priori so we would like to estimate it. This can often be
difficult to approximate, especially when there is a known model mismatch.
We will present one possible method here which matches the process noise
and measurement noise to the past measurement residual process [11]. We
estimate $R_k$ by taking a window of size $W_k$ (which is picked in advance for
statistical smoothing) and time-averaging the measurement noise covariance
based on the measurement residual process and the past states. If we refer
back to Eq. (12), we can simply calculate this by

$$\hat{R}_k = \frac{1}{W_k - 1} \sum_{j=k-W_k}^{k-1} \left( \nu_j \nu_j' - H_j P_{j|j-1} H_j' \right) \tag{29}$$

We can now use our choice of $R_k$ along with our measurement residual
covariance $S_k$, to estimate $Q_{k,k-1}$. Combining Eqs. (9) and (12) we have

$$S_k = H_k \left( F_{k,k-1} P_{k-1|k-1} F_{k,k-1}' + Q_{k,k-1} \right) H_k' + R_k \tag{30}$$

Bringing all $Q_{k,k-1}$ terms to one side leaves us with

$$H_k Q_{k,k-1} H_k' = S_k - H_k F_{k,k-1} P_{k-1|k-1} F_{k,k-1}' H_k' - R_k \tag{31}$$

Solving for $Q_{k,k-1}$ gives us

$$\hat{Q}_{k,k-1} = (H_k' H_k)^{+} H_k' \left( S_k - H_k F_{k,k-1} P_{k-1|k-1} F_{k,k-1}' H_k' - R_k \right) H_k (H_k' H_k)^{+} \tag{32}$$
Fig. 1. Summary of the recursive method for predicting the heterogeneity of the
multi-agent population. We have dropped the 'notation. Shown is a situation with 5
types of agents, where each type has more than one strategy. They each score their
strategies over the sliding time-horizon window $(w_{k-T}, \ldots, w_{k-1})$ and choose the best
one. $H_k$ represents the decisions they each make in this time-step, which in the
case of a binary-decision game is +1 or -1. Taking the dot product of the frequencies
over the agents and their decisions, we arrive at our prediction for the measurement.
We then allow the recursion into the optimization technique. Since we chose $F_{k,k-1}$
as the identity matrix for all $k$, we omitted it entirely from this diagram. We also
assume initial conditions are provided. In the next subsection, we describe how we
arrive at the noise parameters $Q_{k,k-1}$ and $R_k$, which appear in the diagram.



Note that it may be desirable to keep $Q_{k,k-1}$ diagonal if we do not believe
the process noise has any cross-correlation. It is rare that one would expect
a cross-correlation in the process noise. In addition, keeping the process noise
diagonal has the effect of making our covariance matrix 'more positive
definite'. This can be done simply by setting the off-diagonal terms of $Q_{k,k-1}$
equal to 0. It is also important to keep in mind that we are estimating
covariance matrices here which must be symmetric and positive semidefinite, and
the diagonal elements should always be greater than or equal to zero since
these are variances.
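The noise-matching procedure of Eqs. (29) and (32), together with the optional diagonalization, might be sketched as below. The helper names `estimate_R` and `estimate_Q` are hypothetical. Clipping negative diagonal entries to zero is an extra safeguard assumed here so that the result remains a valid set of variances; it is not prescribed by the text.

```python
import numpy as np


def estimate_R(residuals, HPHt_terms):
    """Windowed estimate of the measurement noise covariance, Eq. (29).

    residuals:  the measurement residuals nu_j over the window
    HPHt_terms: the matching matrices H_j P_{j|j-1} H_j'
    """
    W = len(residuals)
    acc = sum(np.outer(nu, nu) - hph for nu, hph in zip(residuals, HPHt_terms))
    return acc / (W - 1)


def estimate_Q(S, H, F, P_prev, R, diagonal=True):
    """Process noise estimate from Eq. (32)."""
    A = S - H @ F @ P_prev @ F.T @ H.T - R      # right-hand side of Eq. (31)
    G = np.linalg.pinv(H.T @ H) @ H.T           # (H'H)^+ H'
    Q = G @ A @ G.T
    if diagonal:
        # Assumption: drop cross-correlations and clip negative variances.
        Q = np.diag(np.clip(np.diag(Q), 0.0, None))
    return Q


R_hat = estimate_R(
    [np.array([1.0]), np.array([-1.0]), np.array([2.0])],
    [np.array([[0.5]])] * 3,
)
Q_hat = estimate_Q(S=np.array([[3.0]]), H=np.array([[1.0, 0.0]]),
                   F=np.eye(2), P_prev=np.zeros((2, 2)), R=np.array([[1.0]]))
```

Because $H_k$ is a single row in our application, Eq. (32) can only recover process noise along the direction of $H_k$; the remaining directions of the estimate are zero.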

4 Estimation in the Presence of an Ecology of Many 
Agent Types 

It is very likely that we will come across multi-agent markets, and hence mod- 
els, with many different agent types, e.g., N > 100. As N grows, not only does 
our state space grow linearly, but our covariance space will grow quadratically. 
We quickly reach areas where we may no longer be in a computationally feasible
region. For example, if we look at strategies for agents playing the Minority
Game and define a type of agent to have exactly 2 strategies, we see that as
we increase the memory sizes of our agents our full set of pairs of strategies
grows very quickly in relation to the memory size (e.g., $m = 1$ yields $2^{2^1} = 4$
strategies and $\binom{4}{2} = 6$ pairs of strategies, $m = 2$ yields $2^{2^2} = 16$ strategies
and $\binom{16}{2} = 120$ pairs of strategies, $m = 3$ yields $2^{2^3} = 256$ strategies and
$\binom{256}{2} = 32640$ pairs of strategies, $m = 4$ yields $2^{2^4} = 65536$ strategies and
$\binom{65536}{2} = 2147450880$ pairs of strategies, ...). If we were interested in
simultaneously allowing all possible pairs of strategies, our vectors and matrices for
these computations would have a dimension that would not be of reasonable 
complexity, especially in situations where real-time computations are needed. 
In such situations, we propose selecting a subset of the full set of strategies 
uniformly at random, and choosing these as the only set that could be in play 
for the time-series. We can then do this a number of times and average over 
the predictions and their covariances. We would hope that this would cause a 
smoothing of the predictions and remove outlier points. In addition we might 
notice certain periods that are generally more predictable by doing this, which 
we call pockets of predictability. 
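The growth in the number of agent types, and the random selection of a small subset of them, can be illustrated with the Python standard library. In this sketch each strategy is identified by an integer index into the full strategy space, which is an assumption made for illustration; the function names are likewise hypothetical.

```python
import random
from math import comb


def num_strategies(m):
    """Number of binary strategies with memory m: 2**(2**m)."""
    return 2 ** (2 ** m)


def num_pairs(m):
    """Number of unordered pairs of distinct strategies."""
    return comb(num_strategies(m), 2)


def sample_agent_types(m, n_types, rng):
    """Draw n_types distinct strategy pairs uniformly at random,
    without enumerating the full set of pairs."""
    total = num_strategies(m)
    types = set()
    while len(types) < n_types:
        a, b = rng.sample(range(total), 2)      # two distinct strategy indices
        types.add((min(a, b), max(a, b)))       # store as an unordered pair
    return sorted(types)


pair_counts = [num_pairs(m) for m in (1, 2, 3, 4)]
types = sample_agent_types(m=4, n_types=5, rng=random.Random(0))
```

Sampling index pairs directly avoids building the list of more than two billion pairs for $m = 4$, so each run only needs to store the handful of types it actually uses.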

4.1 Averaging over Multiple Runs 

For each run $j$ of our $M$ runs, we have our predicted measurement at time
$k$ given by $\hat{z}_{k,j}$ and our predicted covariance for the measurement residual as
$S_{k,j}$. Using the predicted measurements, we can simply average to find our
best estimate of the prediction.

$$\hat{z}_k = \frac{1}{M} \sum_{j=1}^{M} \hat{z}_{k,j} \tag{33}$$

Similarly, we can calculate our best estimate of the predicted covariance for
the measurement residual:

$$\hat{S}_k = \frac{1}{M} \sum_{j=1}^{M} S_{k,j} \tag{34}$$

It is important to note here that since $\hat{z}_k$ and $\hat{S}_k$ are both estimators, as $M$
tends to $\infty$, we expect the standard error of the mean for both to tend towards
0. Also, note that we chose equal weights when calculating the averages; we
could have alternatively chosen to use non-equal weights had we developed a 
system for deciding on the weights. 



4.2 Bias Estimation 



Since we are choosing subsets of the full strategy space, we expect that in 
some runs a number of the strategies might tend to behave in the same way. 
This doesn't mean that the run is useless and provides no information. In 
fact, it could be the case that the run provides much information - it is just 
that the predictions always tend to be biased in one direction or the other. 
So what we might like to do is remove bias from the system. The simplest 
way to do this is to augment the Kalman Filter's state space with a vector 
of elements representing possible bias [12]. We can model this bias as lying in 
the state space, the measurement space, or some combination of elements of 
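either or both. We redefine the model for our problem as

$$x_k^b = F_{k,k-1}^b x_{k-1}^b + u_{k,k-1}^b, \qquad u_{k,k-1}^b \sim N(0, Q_{k,k-1}^b) \tag{35}$$
$$z_k = H_k^b x_k^b + v_k, \qquad v_k \sim N(0, R_k) \tag{36}$$

where $x_k^b$ represents the augmented state and $b_k$ is the bias vector at time
step $k$

$$x_k^b = \begin{bmatrix} x_k \\ b_k \end{bmatrix} \tag{37}$$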



The transition matrix must also be augmented to match the augmented state.
In the top left corner, we place our original transition matrix, and in the top
right corner we place $B_{k,k-1}$ representing how the estimated bias term should
be added into the dynamics. In the bottom left we have the zero matrix so
the bias term is not dependent on the state $x_k$, and in the bottom right we
have the identity matrix indicating that the bias is updated by itself exactly
at each time.

$$F_{k,k-1}^b = \begin{bmatrix} F_{k,k-1} & B_{k,k-1} \\ 0 & I \end{bmatrix} \tag{38}$$



Similarly, we horizontally augment our measurement matrix, where $C_k$ represents
how the bias terms should be added into the measurement space.

$$H_k^b = \begin{bmatrix} H_k & C_k \end{bmatrix} \tag{39}$$



For the process noise, we keep the off diagonal elements as 0, assuming no
cross-correlations between the state and the bias. We also generally assume
no noise in the bias term and keep its noise covariance as 0, as well. Of course,
this can be changed easily enough if the reader would like to model the bias
with some noise.

$$Q_{k,k-1}^b = \begin{bmatrix} Q_{k,k-1} & 0 \\ 0 & 0 \end{bmatrix} \tag{40}$$

We can take this bias model framework and place it into the inequality con- 
strained filtering scheme provided earlier, with the model given by Eq. (17), 
where we simply use the augmented states when necessary, rather than the 
regular state space (e.g. let $x_k = x_k^b$).
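The augmentation of Eqs. (38) to (40) amounts to assembling block matrices. A minimal NumPy sketch follows; the helper name `augment_for_bias` is an assumption, and the example uses five agent types with a single bias term acting only on the measurement, so that `B` is zero and `C` is the 1 x 1 identity.

```python
import numpy as np


def augment_for_bias(F, H, Q, B, C):
    """Build the bias-augmented transition, measurement and noise matrices."""
    n, nb = F.shape[0], C.shape[1]
    Fb = np.block([[F, B],
                   [np.zeros((nb, n)), np.eye(nb)]])          # Eq. (38)
    Hb = np.hstack([H, C])                                     # Eq. (39)
    Qb = np.block([[Q, np.zeros((n, nb))],
                   [np.zeros((nb, n)), np.zeros((nb, nb))]])   # Eq. (40)
    return Fb, Hb, Qb


F = np.eye(5)                       # identity transition, as in the text
H = np.ones((1, 5))                 # illustrative decisions, all +1
Q = 0.1 * np.eye(5)
B = np.zeros((5, 1))                # bias does not enter the dynamics
C = np.array([[1.0]])               # bias shifts the measurement directly
Fb, Hb, Qb = augment_for_bias(F, H, Q, B, C)

xb = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 0.5])   # state followed by bias
z_pred = Hb @ xb                                # H x + b
```

The augmented matrices can then be used wherever the original ones appear, with the state vector extended by the bias term.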

5 Example for a Foreign-Exchange Rate Series 

We now apply these ideas to a data set of hourly USD/YEN foreign exchange
rate data, from 1993 to 1994, using the Minority Game. We do not claim 
that this is a good model for the time-series, but it does contain some of the 
characteristics we might expect to see in this time-series. We look at all pairs 
of strategies with memory size m = 4 of which there are 2,147,450,880 as our 
set of possible types for agents. Since the size of this computation would not 
be tractable, we take a random subset of 5 of these types and use these 5 as 
the only possible types of agents to play the game. We perform 100 such runs, 
each time choosing 5 types at random. In addition, we allow for a single bias 
removal term. We could have many more terms for the bias, but we only use 1 
in order to limit the growth of the state space. We assume the entire bias lies 
in a shifting of the measurements, so we don't use a $B_{k,k-1}$ from Eq. (38) and
only choose $C_k$ in Eq. (39) to be the identity matrix - or in our case simply
1 since $C_k$ is $1 \times 1$.

For the analysis of how well our forecasts perform, we calculate the residual
log returns and plot these. Given our time-series, we can calculate the log
return of the exchange rate as $l_k = \log(r_k) - \log(r_{k-1})$. Note that based
on our definition for $z_k$ from Eq. (1), we can write the log return also as
$l_k = \log(r_{k-1} + z_k) - \log(r_{k-1})$. Similarly, we can define our predicted forecast
for the log return as $\hat{l}_k = \log(r_{k-1} + \hat{z}_k) - \log(r_{k-1})$. Given these two quantities,
we can calculate the residual of the predicted log return and the observed log
return as $\tilde{l}_k = l_k - \hat{l}_k$. Using the delta method [13], we can also calculate
the variance of this residual to be $S_k / (r_{k-1} + \hat{z}_k)^2$. We perform 100 such runs, which
we average over using the method described in Section 4.1. A good test for
whether our variances are overly optimistic is to check if the measure satisfies
the Chebyshev Inequality. For example, we can check visually that no more
than about $1/9$ of the residuals lie outside the $3\sigma$ bounds. We plot the residual log return
along with $3\sigma$ bounds and the standard error of the mean in Figure 2. Visually,
we would say the residuals in Figure 2 would certainly satisfy this condition if
they were centered about 0. Maybe further bias removal would readily achieve
this.
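The residual log return and the Chebyshev check can be sketched as follows. The variance line uses the first-order delta-method approximation for the logarithm, which is the assumption made in this sketch; the function names `log_return_residual` and `fraction_outside` are illustrative only.

```python
import numpy as np


def log_return_residual(r_prev, z, z_hat, S):
    """Residual between observed and predicted log returns, with its variance.

    r_prev: previous exchange rate r_{k-1}
    z:      observed difference z_k
    z_hat:  predicted difference
    S:      predicted variance of the measurement residual
    """
    l_obs = np.log(r_prev + z) - np.log(r_prev)
    l_hat = np.log(r_prev + z_hat) - np.log(r_prev)
    resid = l_obs - l_hat
    # Delta method: Var[log(r + z_hat)] is approximately S / (r + z_hat)^2.
    var = S / (r_prev + z_hat) ** 2
    return resid, var


def fraction_outside(resid, var, k=3.0):
    """Fraction of residuals lying outside the k-sigma bounds."""
    return float(np.mean(np.abs(resid) > k * np.sqrt(var)))


resid, var = log_return_residual(
    r_prev=np.array([100.0]), z=np.array([1.0]),
    z_hat=np.array([1.0]), S=np.array([4.0]),
)
share = fraction_outside(np.array([0.0, 10.0]), np.array([1.0, 1.0]))
```

By the Chebyshev Inequality, at most $1/k^2$ of the residuals should fall outside the $k\sigma$ bounds if the variances are not overly optimistic, so for $k = 3$ the value returned by `fraction_outside` should not exceed about 1/9.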
Fig. 2. We show the residuals of the log returns plotted with $3\sigma$ bounds centered about 0
and the standard error of the mean over the 100 runs. Despite the one parameter
bias removal, we still see a general bias in the data without which the residuals would look
much cleaner. Perhaps a more complex bias model would remove this.



6 Conclusion 

This paper has looked at how one can infer the multi-trader heterogeneity in 
a financial market. The market itself could be an artificial one (i.e., simulated, 
like the so-called Minority Game), or a real one - the technique is essentially
the same. The method we presented provides a significant extension of pre- 
vious work [6]. When coupled with an underlying market model that better 
suits the time-series under analysis, these techniques could provide useful in- 
sight into the composition of multi-trader populations across a wide range of 
markets. We have also provided a framework for dealing with markets con- 
taining a very large number of active agents. Together, these ideas can yield 
superior prediction estimates of a real financial time-series. 